PARSEME-It Corpus: An Annotated Corpus of Verbal Multiword Expressions in Italian
Abstract
This paper describes a new language resource annotated with verbal multiword expressions (VMWEs) in Italian. It discusses the state of the art in VMWE identification and annotation for Italian, the methodology adopted, the VMWE categories annotated in the corpus, the corpus itself, and the annotation process. It closes with the results obtained, conclusions, and future work.
Related papers
A data-driven approach to verbal multiword expression detection. PARSEME Shared Task system description paper
Multiword expressions are groups of words acting as a morphologic, syntactic and semantic unit in linguistic analysis. Verbal multiword expressions represent a subgroup of multiword expressions, namely that in which a verb is the syntactic head of the group considered in its canonical (or dictionary) form. All multiword expressions are a great challenge for natural language processing, but the ...
The PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions
Multiword expressions (MWEs) are known as a “pain in the neck” for NLP due to their idiosyncratic behaviour. While some categories of MWEs have been addressed by many studies, verbal MWEs (VMWEs), such as to take a decision, to break one’s heart or to turn off, have been rarely modelled. This is notably due to their syntactic variability, which hinders treating them as “words with spaces”. We d...
Extracting Verbal Multiword Data from Rich Treebank Annotation
The PARSEME Shared Task on automatic identification of verbal multiword expressions aims at identifying such expressions in running texts. Typology of verbal multiword expressions, very detailed annotation guidelines and gold-standard data for as many languages as possible will be provided. Since the Prague Dependency Treebank includes Czech multiword expression annotation, it was natural to ma...
Building an Arabic Multiword Expressions Repository
We introduce a list of Arabic multiword expressions (MWE) collected from various dictionaries. The MWEs are grouped based on their syntactic type. Every constituent word in the expressions is manually annotated with its full context-sensitive morphological analysis. Some of the expressions contain semantic variables as place holders for words that play the same semantic role. In addition, we ha...
A French Corpus Annotated for Multiword Expressions with Adverbial Function
This paper presents a French corpus annotated for multiword expressions (MWEs) with adverbial function. This corpus is designed for investigation on information retrieval and extraction, as well as on deep and shallow syntactic parsing. We delimit which kind of MWEs we annotated, we describe the resources and methods we used for the annotation, and we briefly comment the results. The annotated ...